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PREFACE 

This paper is intended for those unfamiliar with the method of 
least squares. It was written to present completely a few examples 
of easily programmed equations which serve a variety of purposes 
and to introduce without extensive development the more complex 
general non-linear equation treatment. 
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THE THEORY AND APPLICATION OF LEAST SQUARES 


Simple Linear Equation 


A problem area in science is that of determining the proper formula for 
relating observations. Even when an equation which correlates experimental 
data is known, the best scaling and transformation coefficients are initially 
uncertain. 

For example, assume a simple set of observations of X and a corresponding 
set of measurements of Y are plotted: 


Y 


12 OBSERVATIONS 
AND MEASUREMENTS 



Figure 1 


X 


A reasonable assumption would be that there is a linear relationship between X 
and Y. That is, 


Y - AX + B 


( 1 ) 


This is an equation for a straight line. But what choice of A and B best repre- 
sents the data? For any two choices of data points, a solution for A and B can 
be obtained from (1). This supplies a straight line through those two points. 
For fig. 1, this is 12 !/10 !2 ! = 66 possibilities. Certainly, it would be too time 
consuming examining each possible line with respect to the others to see which 
one appears best. Often a straight edge is aligned visually on the plot in an 
approximation of a close fit. 
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For many purposes either of the above rough techniques may be satisfactory; 
however, a more analytic treatment is normally preferred. This treatment should 
be an objective, general approach. That is, the problem is stated, equations pro- 
posed, observations or measurements made, and solutions derived from the 
equations. 

P^or equation (1) and fig. 1, the problem is to determine by a mathematical 
algorithm the best straight line for the given points. Twelve observations relating 
X and Y are given. The coefficients A and B of (1) are, therefore, overdetermined. 

Yj % AXj + B 

y 2 % ax 2 + b 


(2a) 


Y 

n 


% AX n 4 B 


The general set of n equations is represented in (2a). Our example from fig. 1 
has n = 12. The approximate nature of the equations is eliminated by explicitly 
including the residuals A. , which are the differences from observations Y i and 
the straight line computed values (AX i +B), 

Yi - (AX a +B) =A. (3) 

for i = 1, 2, . . . , n. The equations (3) are known as condition equations . We 
should like to find values for A and B such that a minimum error in Y could be 
expected for additional observations of X . 

A method which achieves this aim is that of least squares, so named because 
the object is to minimize the sum of the squares of the residuals or observation 
vs. computation errors. That is, 



[Yi - (AXj + B )] 2 = 



minimum . 


( 4 ) 
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Expanding (4) gives, 


n n n n n n 

V^Yf + a 2 Xf + nB 2 - 2A ^ Y.X. -2B Y i + 2AB J^X i = /Af . 

i-1 i = 1 i rrl i = l i = l i=l 


(4a) 


n 

To minimize ,2 i A? with respect to particular coefficients A and B , the partial 
derivatives of £ A? with respect to A and B must be zero. 


a £ a? 

i = 1 

3A 


n 

0 = 2A ^~ l Xf - 2 

i = 1 


i = l i = 1 


X. 


(5) 


3 £ A? 

— = 0 = 2nB - 2 

3B 



( 6 ) 


where (5) and (6) are normal equations . Thus, with two equations in two unknowns, 
a solution is given by: 


B = 



i = l 


i=i 



n 



= 0 


3 


or 


nA 


n n n n / n 

Z>- !>■>* Z> Z>- (Z> 

*U1 i=l i = l i = l \ i = l 


= o 


whence, 


Then, 


n n n 

n £ y i x i - £ Y . £ 

i = l i=l i = 1 

- £ *?-(£ 


(7) 



whence, 


B 


n n n / _n \ 

5 5 ~ SWS’v 


Equation (2) can be expressed in matrix notation. 


X 1 

n 

( n x 2 ) 


L. Y n J 


(2x1) (n x 1 ) 


( 8 ) 


4 



i 


But this is just ASX? + BSXj = SXj Y t and A£X i + nB = SY { , equation (5) and 

( 6 ). 

Q.E.D. 

Singularity of a matrix is indicated if the determinant of it is zero, so that 
if X T X is nonsingular, then an inverse exists and P can be solved for by pre- 
multiplying (10) by (X T X)"\ the inverse of X T X. 

(X T X) _1 (X T X) P = (X T X)' 1 X T Y (11a) 


P = (X T X) _1 X T Y 


(lib) 


A 

B 


n 


n 



i — 1 







n 


n 


n 


L 

i 






(lie) 


However, this is equivalent to (7) and (8). 


As was indicated before equation (9), an analytic justification is necessary 
to prove that the matrix representation of (9) is equivalent to a least squares 
treatment. Observe that the condition equations (3) were not given an exact 
expression in a matrix form. This is simply done by: 



(12a) 
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A simplification of the solution is achieved by replacing the approximate sign by 
an equal sign (the justification for this step is given on page #6). Matrices may 
then be expressed by 


X P = Y ( 9 ) 

(n x 2 ) (2x1) ( n x 1 ) 

Because the matrix X is not a square one (n x n), inversion directly is not possible. 
This dilemma is resolved by premultiplying (9) by the transpose of X . 

X T X P = X T Y ( 10 ) 

v (2,n) (n x 2) (2x1) (2 x n) (n x l) 

v J 

(2x2) 


This is the matrix form of the normal equations (5) and (6) . Because the trans- 
pose matrix product of a matrix is a square matrix, the product matrix (X T X) is 
a 2x2square matrix. 

We are justified in calling both (10) and the set (5) -(6) normal equations be- 
cause they are equivalent. This is shown by expanding (10). 


X T X 



X t Y 



x i 1 


Y i“ 


X, 1 


Y 2 

[Xl X 2 **• X nl 

2 


-a] [x, x 2 ■ ■ • xi 


Li i • • • ij 

• • 


bJ Li i • ■ • iJ 

• 


• • 


L -J 

* 


X 1 

n 


Y 

n 


n n 

z> z> 

i=l i =1 

p 


i=i 

1 

C 

|_i 

.J ‘ 

i 

i 
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or 


Y - XP = V. (12b) 

Equations (12) are the matrix form of the condition equations (3). For matrices, 
the sum of the squares of the residuals is represented by: 

(Y - XP) T (Y - XP) = V T V. (13a) 

Manipulating (12) by matrix rules yields, 

Y T Y - Y T XP - P T X T Y + P T X T XP = V T V (13b) 

but for differentiable matrices, V T V may be minimized with respect to P by: 

iXlX = 0 = - (Y t X) t - X t Y - 2X t XP (14) 

0 = 2X t XP - 2X t Y (14a) 

which may be solved for P to give 

P = (X 1 ^)- 1 X t Y. (15) 

This is just our result (10), justifying our simplification made at that time. 
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Three Dimensional Orthogonal Transformation 


A second example which is more complex, more general, and more interest- 
ing is the three-dimensional orthogonal vector transformation (Orthogonality 
implies mutually perpendicular axes). This transformation serves to rotate and 
translate a vector R into another vector R' (fig. 2). Another way of viewing 
the transformation is as a rotation and translation of a coordinate system into 
another reference frame (fig. 3). 



Figure 3 


An axial rotation is a rotation of a frame about one of the axes X, Y , or Z. 
At most, three axial rotations, each representable as a direction cosine matrix 
in form 


-s i n 


0 \ 


cos 0 sin 0 


'■ o s 0 J 



-s 


in f?\ 


0 

cos 0 


or 


/ 


cos 6 s in 6 O' 

-sin 8 cos 6 0 

0 0 1 
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are necessary to perform any transformation. We shall specifically examine 
a set of three axial rotations defined by: 

(1) rotation around X axis by roll angle co 

(2) rotation around new Y axis by pitch angle 0 

(3) rotation around newest Z axis by yaw angle k . 

A large number of possible choices exist however, e.g. 


pitch ( 0 ), roll (co), yaw (k) about Y- X - Z 
heading (H), roll (co), pitch ( 0 ) about Z - X - Y 
heading (H), pitch ( 0 ), roll (w) about Z - Y - X 
azimuth (a), tilt (t), swing (s) about Z - X - Z 
azimuth (a), elevation (h), swing (s) about Z- Y - Z etc. 


The direction cosine matrices are represented by: 


Aj = | 0 cos » sin » 
0 -s in co- cos CO' 


A 2 = 


/ cos 0 0 -sin 0^ 

0 1 0 
Y s in0 0 cos 0 


cos k sin k O' 
A 3 = | -sin k cos k 0 


A vector R can now be transformed into R' by a series of three axial rotations: 


(16) 


Rj = Aj R ; R 2 = A 2 R x ; R' = A 3 R 2 . 


(17) 
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Instead of making three separate steps, a single rotation represented by direction 
cosine matrix A = A 3 A 2 A x is sufficient. ff A ff is thereby a product matrix (3x3) 
of three orthogonal matrices and is therefore orthogonal itself. 


( COS 4> COS K 

-cos 4> sin/< 
s'mcfc 


cos (x) sin k +sinr^ sinc£ cos/< 
coso) cos k -sir'll sin0 sinx 
“Sino,> cos^ 


sino; sin k -cos a> sin <fi cos k 
sin cosk +coscl> sin<£ sin* 
cos cos <f> 


(18) 


Note: All rotations considered in this section are counter clock-wise in accord 
with current scientific convention. 

A review of a few properties of orthogonal direction cosine matrices may be 
useful. 

If A transforms vector 


then 


where 



X 


“x“ 

R < = > 

Y 

into R' < = > 

Y* 


Z 


Z' 


R' = AR 


(19) 


A = 


11 

a i2 

3 1 3 


/cos X'X 

cos 

X'Y 

cos 

X'Z 





1 _ 


H 


_ 

21 

a 22 

a 23 

= 

cos Y'X 

cos 

Y'Y 

cos 

Y'Z 





l _ 


„ 


_ 

31 

a 32 

a 33^ 


\ cos Z'X 

cos 

Z'Y 

cos 

Z'Z 


(18a) 


That is, the elements of a direction cosine matrix are the cosines of the angles 
between two axes, one from each reference frame. 
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Thus the relations a n 
s inw s in<£ cos k , . . . 
the angles between the 
of the rotation angles c*> 


Z, Z 



via Aj ( [co- ) 


2 



= cos X'X = cos0cos/< , a J2 = cosX'Y' = coswsin k + 
a 33 = cos Z'Z = cos co. cos 4> supply the dependence of 
X - Y - Z axes and the X' - Y' - Z' axes as functions 
4> , «. Explicitly, "A" is produced by: 




via A 2 (0) via A 3 ( k ) 
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For any orthogonal matrix A , 


3 



kj 


= 8 


ik 


holds for rows: 


i = 1 , 2 , 3 ; k = l,2,3 




i j 


ik 


= 8 . 


1=:1 


holds for columns: 


j =1,2,3; k = l,2,3 


where 


8 ., = 1 for i-k; 8., = 0 for i^k 

i k ik 

S. k - 1 for j =k; S jk = 0 for i 


det A = | A | = ± 1, +1 when a right (left) -handed coordinate system is trans- 
formed into another right (left) -handed system 



and - 1 when a right (left) -handed system is transformed into a left (right)-handed 
one. 
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For our purposes, we wish to choose a transformation 



R' = A R. 


(19a) 


(19b) 


The peculiar choice of elements for the direction cosine matrix will be useful 
in a later development. Now we turn to the solution of (19) by least squares. 


If n vectors are measured in both of two coordinate systems (where n > 3, 
since each vector supplies 3 points) , 




*1 

X 2 • • • 



(*' 

X 2 

y; 

Y' * * • 

Y' 

= A 

Y, 

Y • • • 

1 

2 

n 

J 


l 

. 

2 

Z' 

7' • • • 
/w 2 

r °! 


\ z - 

z • • • 


(3 X n) 


(3x3) 

(3 x n) 



R' = A R . 




(20 a) 


( 20 ) 


The subscript n denotes the number of vectors in the matrix. To make the right- 
hand side invertible, post multiply by . 


R' R t = A R R t (21) 

n n n n 

We may solve for A as we did for P in the previous example. 


R' R T (R R^)- 1 = A R R t (R R 1 )- 1 

nn v nn / nn v nn / 

(22a) 

A = R' R t (R RT)-i 

n n x n n ' 

(22b) 
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Explicitly, 


i 

f\ X 2 * * X n\ 


/ X ! Y ! Z l\ 

II 

1 

Y Y • • • Y 

X 1 1 2 1 n 


X 2 Y 2 Z 2 

1 

Z 2 • • • Z n / 

lx Y Z j 

\ n n n / 



n 



i=l 






n 

x' Y i 

i = l 


n 

Y' Yi 

i = l 


Z' Y. 

i i 

i = 1 




z_, 
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Projective Transformation 


'A 


Consider a scaled form of equation (18), 



(23a) 


We wish to project the points in an xy plane onto points in an XY plane. That is, 
a projection with Z = z = constant. We thus have: 



+ 




z 





> 



(23b) 


Moreover, if the constant Z = z is chosen to be one, 


Ax + By + C 
Dx + Ey + I 


Fx + Gy + H 
Dx + Ey + I 


(24) 
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For an orthogonal-conformal transformation, as I approaches the value of one, 

D = E= C = H = 0. If the requirement for conformality is now relaxed such that 
D, E , C , and H may not be zero for 1=1, then 


Ax + By + C 
Dx + Ey + 1 


(25) 


Y = Fx + Gy + H 
D x + Ey + 1 


(26) 


This is an eight parameter nonconformal projective transformation. It is used 
for mapping planar surfaces into other planar surfaces. 



Figure 7 


Specifically, an oblique photograph from the air may be transformed point by 
point into a map of the same area. Also, a linearly distorted and scaled network 
or grid may be correlated to a reference grid. 

We shall be concerned with how the best choice of the coefficients A through 
Hare made. Assume a network with n grid intersections which are measured. 
Let this be a reference grid. Similarly measure the n intersections of a dis- 
torted grid. The ith reference grid intersection is denoted by x. , y. and the 
corresponding ith distorted grid intersection is denoted by X A , Y i . Since there 
are eight parameters to be determined, there must be at least four measured 
intersections (each intersection supplies two equations). A "good” pattern of 
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points is recommended. That is, not all points from the same area, in a line, 
or in any way likely to create a bias in the solution should be chosen. After a 
solution for the coefficients is made, any arbitrary point within the point pattern 
on the distorted grid may be measured and transformed into the reference grid. 



Xi.Yi 



DISTORTED GRID 


Figure 8 


In many cases of interest, orthogonality and conformality are almost preserved; 
that is, A = G , B = -F, D»E» 0. 

The following 2n equations are formed when n intersections from reference 
and distorted grids are measured. 


Axj + Byj + C 
1 D Xj + E + 1 


Fx + Gy + H 
1 D Xj + Ey t + 1 


X 


2 


Ax 2 + By 2 + C 
Dx 2 + Ey 2 + 1 


F x 2 + Gy 2 + H 
Dx 2 + Ey 2 + 1 


(27) 


X 


n 


Ax n + By n + C 

Dx n +E Tn + 1 


Y 

n 


Fx n + Gy n + H 
Dx n +Ey n + 1 
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Expressed as condition equations, 


Xj - (Axj + By x + C - Dx x X x - Ey x X x ) = A n 

X 2 - (Ax 2 + By 2 + C - Dx 2 X 2 - Ey 2 X 2 ) = A 12 

• • • 

• • • 

• • 

X - (Ax + B y + C - D x X - Ey X ) = A, ( 28 ) 

Y x - (Fxj + Gy x +H-Dx 1 Y 1 -Ey 1 Y 1 ) = A 21 
Y 2 - (Fx 2 + Gy 2 + H - Dx 2 Y 2 - Ey 2 Y 2 ) = A 22 

Y - (F x + G y +H-Dx Y - Ey Y ) = A, 

Square the residuals A . . , sum over j and i , and minimize with respect to the 
coefficients. 


9 . n O n On 



Once again by resorting to matrix notation, the solution is simplified. 
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(2n x 8) 


x i Vi 1 " x i X i 

x 2 y 2 1 " X 2 X 2 

x y 1 -x X 

n J n n n 

0 0 0 -Xj Yj 

0 0 0 -x 2 Y 2 

0 0 0 -x Y 

n n 


-y, Xj 0 0 0 

-y 2 X 2 0 0 0 

-y X 0 0 0 

-y i Y i x i y i * 

_y 2 Y 2 X 2 y 2 1 

• • » • 

• • • • 

-y Y x y 1 

J n n n J n 


X P = Y 


Again, square matrix X by premultiplying by the trai 

X T X P = X T Y 
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where X T X 


>- 

X~ 


>T 


u i u]- -kj- ^ i 


>f 

x" 


>* 

°>r 


>> >> 


•> ]- W J; UJ’ W- 


>H 

>r 

x" 


>> 

x~ 


a W= =t~j3 W =UJ: ‘kP 


X 

>r 


<N -i 
>> 


X 

>r 


^ C W- =w^ 




>« 

>r 

x" 

■w 


>■ 
CN •- 
>> 


>< 

>f 


=kj: -kj- 


x 

>T 

x H 


X 

x - 


■Wj -UJ: "W 


f (N-. CM — 

CM- CN-, X > 

X >« 

CN-- CN — >> >1 


> 

CN .-I 

X 


>< 

>r 

X 


>« 

X** 


■w ; "UJ- W: 


>> 

X 


X 

>r x - 

X 

>r 

Ul ■ kp 

1 

i 

*T 

CN-i - >t 

>> >> 

X 

: W- W? C^J3 

i 

. -Z> 

i =1 

x" 

>» ■-< CN-< 

X X 

Kl -W2 -W= 

« 
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and 


X T Y = 


L 

L 



Inverting (X T X) and premultiplying into (30), 


(X T X)- 1 X T X P = (X T X)- 1 X T Y 

(31a) 

P = (X T X)- 1 X T Y 

(31b) 


To reemphasize a statement previously made, points computed from the coefficient 
parameters should lie within or very near the spread of points used for calcu- 
lating the coefficients. 


The projective transformation serves to analytically project a plane (x , y ) 
onto another (X , Y ) where one plane is tilted with respect to the other and the 
rays of the projection may be viewed as parallel, diverging, or converging. 
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N Degree Polynomial 


As a third example, the general n degree polynomial transformation is 
illuminating. In many situations, data points can be plotted such that a simple 
curve can adequately represent the observations. 

Y = a Q + a 1 x + a 2 x 2 + • • • + a n x n (32) 



Figure 10 


A least squares solution for coefficients a. can be formed whenever m > n 
observations in X are made. 


Yl ~ a 0 + 3 l X l + a 2 X l + ' " + 3 n X ? 

y 2 " a 0 +a i X 2 +a 2 X 2 + +a n X 2 

(33) 


a 0 + 3 1 X m 


+ a, x^ + 

2 m 


+ a 


X 


n 

m 


for m ^ n. 
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The condition equations in matrix notation are: 



Y - X A = V. 


Normal equations are formed via, 

- A t X t Y + A t X t X A + Y t Y - Y t X A = V T V . 
Minimizing V T V as before (eq. (14)) gives normal equations, 


X T Y - X T X A = 0. 


Whence, 


A = (X T X)- 1 X T Y, 

where 


m 

m 

z> 

i — 1 

m 

• £*" 
i^l 

m 

Z> 

" i s 1 

m 

i = 1 

m 

i =1 

m 

• ■ £ x ‘ ,! 
i=i 

tn 

Z> 

i = l 

m 

i=l 

m 

. ^ .... 
1st 

t = i 

£ x ' ltl 

L = 1 

X>“ ■ 

i = l 

m 

£> 

i =1 


( 34 ) 


(34a) 


(35) 


(36) 
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For demonstrative purposes, the above will be hand computed. 


Vi - ( a o + a i x i + " ' +a n x i> = A i 

y 2 - ( 3 0 + a i X 2 + + a n X 2> = A 2 

(34b) 

y - (a. + a, x + • • • + a„ x") = A 

v 0 1 m n m' m 


Squaring and summing the residuals, 


m m m m 

22 a ? = 22 + m a ° + 2 a ° a i 22 Xi + 2 a ° a 2 

i =1 i =1 i=l l -1 


m 



+ 2 a 2 a n 22 x ? +2 + “ ■ + a ' 22 x2n ■ 2 a ° 22 Vi 

-2aj 22 - 2a 2 22 x?yi ■ *“ ' 2a " 22 x ' Yi ' (37) 


To minimize the sum of the squares, find the first partials with respect to the 
coefficients a i and equate each to zero. 
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tn 


d a„ 


m m . m J2-, 

2 m a Q + 2 aj x. + 2 a 2 xj + ■ • * t 2a n / ^ x " “ 2 / y i = 0 


Z> 

i = 1 


d a. 


m m ™ Vi 

2a 0 7] *. +2a, ^ x ?+ 2a 2 £ xf , • • • + 2 a„ ^ x-> - 2 ^ *. V. ; = 

1=1 i - 1 ■ = > 1 = 1 


(38a) 


m 

Z> 

1=1 


m m m m__^ m 

2a 0 ^ x " + 2a t ^ x^ + 1 + 2 a 2 ^ x^ 2 + • • • + 2 a n ” 2 4_, V ‘ 


This is n+1 equations inn +1 unknowns a. ( j = 0, 1 n). 

The multiplicative factor 2 may be removed, allowing (38a) to be expressed in 
matrix form, 


m 

m 

i =1 

m 

I*! 

i=l 

m 

• • ■ £> 
i=l 

m 

Z> 

i = 1 

E'‘ 

E '■ 

••• E'- 

m 

L* 

2> 

L< 

■■■ Z> 


(n + 1) x (nil) 


m 

E < E'"‘" E ■ ■ • I> 


1 1 1 


y2 y2 

X 1 *2 3 


(n +l)xl 


«, n v n v n . • > y^ 1 

X 1 X 2 X 3 m 


(n + 1) x m 


(38b) 


m x 1 
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Which can be expressed as 


(X T X) A = X t Y. 


(38c) 


Continue solution as for eq. (36). 
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General Non-Linear Equatio n Approach 

For our last development, we wish to derive a general approach whereby 
non-linear equations can be solved by least squares. A Taylor's expansion 
linearization will be used leading to approximate solutions. By an iterative 
process, the approximate solutions will be improved until changes in the solution 
become less than a predetermined limit. The generality is further increased 
by also allowing a limited number of coordinates x k to be better determined 
than from observation. For i = maximum number of coefficients to be 

max 

found and k = maximum number of coordinates to be re-evaluated, m > j + 

max max 

k is necessary for a least squares method to work (m = number of observa- 

max 

tions). 

Assume a general function, 


X. = X. (a., x k ) (39) 

A 1 J ^ 

This is expressible as a polynomial in the coefficient parameters a. and co- 
ordinates x k . 


r 




w 


Y Y 
12 


Z 1 Z 2 


Form condition equations, 


r 


X 1 X 2 


a. {ABC 


x u < 


y,y 


1 J 2 




Z 1 Z 2 


f <e<V 


x k ) = 


(40) 


As implied, linearization can be achieved by expanding the condition equations 
in a Taylor's Series and dropping second and higher order terms. 


°=f e ° + 




(40a) 
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aj! = initial or interated value of a. , x° = initial or iterated value of x k , 
such that f£ = fj> (X., a° , x° ). 1 

For notational simplicity let 



da° 

1 


and 


0 = h = k ff,. K* V =*i 

In matrix notation the l condition equations become: 


(40b) 


* / 

V “**/ 


t XI 

max 


'Hi 

3x° 


AM /Hi 

/ o \ 3x? 


3x? 


Hi 

3x° 


dfi Sty 


3x? 


3x? 


3f, 


Hi 

3x,° 


ate 


3x“ 


e xk 

max max 


h \ 


\ Vk max/ 


k X 1 

max 


3a? 


Hi 

3a? 


3a? 


3f, 

3a° 


Hi 

3a, 


3f^ 3f^ 


3a? 


H«. 

3a° 


Hi 

3a? 


3 ft 


3a° 


l X j 

max max 


/:■ \ 


w 

\ ^ma-xj 


j XI 

max 


(40c) 


0 = F + AV + BA 


(40d) 



The matrices of (40d) relate directly to the ones of (40c) and are specifically: 

A , the matrix of partials with respect to coordinates 
B , the matrix of partials with respect to coefficients 
V , the matrix of residuals of measurements 
A , the matrix of parameter corrections 
F , the matrix of initial or iterated estimates of function. 

In this case, the sum of the squares of the residuals must be minimized with 
respect to the coordinate and coefficient parameters. The parameter correc- 
tions A are needed to do this. 

Consider, 


S = V T P V, (41) 

the matrix form of the weighted sum of the squares of the residuals. Thus P is 
a weighting matrix which is the identity matrix when all measurements are con- 
sidered to be equally good. 

To introduce an explicit dependence of S on the condition equations, equa- 
tions of constraint 



are inserted in 


S = V T P V - 2 A t (F + A V + B A) 

v Y _ < 

= o 

where 


A T = f A-i \ 2 



(42) 
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and the individual A.^ are Lagrangian undetermined multipliers. S is now a 
function of the residuals and of the parameter corrections, S = S( V , A ). Thus 
S can be differentiated with respect to V and A and set to zero for minimizing 


P V - A 7 A = o 

(43) 

- B 7 A = 0. 

(44) 


(43) and (44) are matrix forms of the normal equations. The weighting matrix 
P is always square and nonsingular such that 

V = P' 1 A T A. (45a) 


This leads to 


F +AP- 1 A T A + BA = 0, 


(40e) 


whence, 


A = - (A P- 1 A 7 )" 1 (BA+F). 


(46) 


Therefore, 


-B T A = -B T [-(AP- 1 A 7 )* 1 (BA + F)] =0 


(44a) 


or 


[B 7 (A P- 1 A 7 )" 1 B] A + B 7 (A P* 1 A 7 )" 1 F = 0 


(44b) 


yielding, 


A = [B 7 (AP- 1 A 7 )" 1 B]* 1 [B T (A P -1 A 7 ) -1 ] F 


(47) 


and 


V=P- 1 A 7 [-(AP- 1 A 7 )- 1 B{ [B t (A P” 1 A t )* 1 B]' 1 [B^AP-UV^Fl + F]. (45b) 
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Since 


A = 


\ 



\ 


and V = 


\ 



\ 


/ 


the values of the coefficient parameters become for the next iteration 


(a?) = a° + 

v j new j 


8 . 


(48) 


and for the coordinates 


(x?) = x? + 

v k new k 


(49) 


The 8 . are compared with a small finite limit e and v k are compared with 
another limit e . If the 8 and v. standard deviations are less than e, and 
e 2 repsectively, where an error analysis is performed on the inverted normal 
matrix to produce the standard deviations then the solution is complete; other- 
wise, (48) and (49) replace a° and x° respectively in (40b) and the algorithm 
proceeds as before. Iterations continue until and e 2 are reached. If, after 
a given number of iterations, e 2 is not achieved by the standard deviation of 
the residuals, those still greater than e 2 are dropped from the equations (40) 
and a last iteration is performed. 

In setting up the equations, if approximations for the a. and x k are initially 
made, acceptable answers may be achieved after only a few iterations; however, 
complete uncertainly may exist at first such that zero values are the approxi- 
mations. This means the iterative process will be longer, but residuals and 
corrections should converge. 

Other methods, differing in specific detail but following this general outline, 
exist in the literature. One increase in complexity occurs when the need for 
partitioning of all of the matrices is necessitated by use of different point types 
among the coordinates. 
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Obviously a computer routine is essential to make the above approach 
feasible. Indeed, for the examples given in this paper, the establishment of 
computer programs to perform the least squares algorithms is desirable and 
necessary to achieve the best utilization of this powerful concept. 
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APPENDIX 


NASA Oriented Application: Satellite Position 
and Range Determination 

To illustrate least squares applications in NASA oriented problems, consider 
the procedures involved in determining the position, in geocentric coordinates, 
of an orbiting satellite. If the satellite is observed simultaneously from sever- 
al ground stations, its position can thence be evaluated. Additionally, the range 
distance from each station to the satellite can be found. Many such determina- 
tions of satellite position within the same orbit enable the orbital parameters to 
be computed. 

Notational definitions: 

Let X - , Y. , and be the Cartesian geocentric coordinates of the J th 
station. 

Let x! . , y' . , and z' . be the geocentric coordinates of the satellite as 
observed at position i and as determined from station j . 

Let x , y , and z . be the local Cartesian coordinates of the satellite 
at position i with the local system origined at station j . 

Let x ; , y ; , and be the "true" (that is, best fit) position of the satellite 
in the geocentric coordinate system. 

In spherical coordinates the satellite at position i with respect to the sta- 
tion j local system is: 


X 

J 1 

II 

sin 

y Ji 

II 

sin 

z . 

J l 

= ^ii 

cos 



(50) 


where a is right ascension, cp is codeclination (90° — 8), 8 is declination, and 
p is range. 


Assume the geocentric coordinates Xj , , and Z. for each station j are 

known and the celestial angles <p . ; and a. . are measured. By requiring the axes 
of each local system to be parallel with the geocentric one, the following rela- 
tions hold: 
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( 51 ) 


/ 


X. . 

= X . 

J 1 

3 1 

y'a 

>r 

ii 

/ 

z 

- z . 

j i 

3 1 


'i J 


Furthermore, the differences expressed as residuals in X, Y ,Z between "true" 
satellite positions in geocentric coordinates and station-relative satellite posi- 
tions in geocentric coordinates are: 


P\ 


A x. . 

= x! . - X 



j i 

J i i 



Ay-. 

y j i 

- y. • - y- 

j j i j i 

>■ 

(52) 

A z. . 

/ 

= Z. - - z. 



j i 

J 1 1 J 



SKI? 

3 

i + A y?i + < 


(53) 


Therefore, if we assume n ground stations simultaneously observe the satellite 
position i , 


P li 

s in 


cos 

a n 

+ X i 

~ X. 

i 

^2i 

s in 

¥21 

cos 

a 2i 

+ x 2 

~ x. 

1 

^ni 

s in 

¥ni 

cos 

a . 

ni 


~ x. 

1 

Pvx 

sin 

¥ii 

sin 

a H 

+ Y i 


Pli 

sin 

¥ 2 i 

sin 

a 2i 

+ Y 2 

rv, 

~ Yi 

Pni 

s in 

L 

sin 

a . 

m 

+ Y n 

• 

- y L 

Pli 


cos 

<Pu 


+ z x 

'X/ „ 

~ z i 


P 2i 

COS 

<p 2i 

+ z 2 

~ z • 

1 

Pni 

cos 

¥ni 


_ 

~ z i 


(54) 
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From these 3n equations per satellite position i we wish to determine the 3 + n 
unknowns per satellite position i . As before, expressing the approximate 
equations (as for (56)) in matrix form produces the condition equations. For our 
purposes, we wish to separate the unknowns p x . , p 2i , ... , p ni , x i , y i , and 
z. from the known factors. The resulting condition equation is: 



For convenience express (55a) as: 

S. R = X ( 55b ) 

11 1 

The solution of 55b) for R. is equivalent to a least squares evaluation of 



n 

[(Aji siny-i cos a.. +X. -x.) 2 + sin <p. . sin a. . +Y. -y^ 2 


(p ji costp.. 4 Z j - Zi ) 2 ] . 
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That is, the sum of the squares of the residuals Q. is minimized by finding the 
partials of Q. with respect to the unknowns evaluated as zero. e.g. 


3Qi 3Qi 
^Pli ' 3 P2i 


BQ i 

Bp . ~ BX. “BY" 

'n 111 



(57) 


Considering (55b) first, 

S T S. R. = S T X. (58a) 

11111 

which is, explicitly. 


(n + 3) x (n + 3) (n + 3) x 1 


n 

0 

0 

sin . cos a y , 

sin cp 2 . cos a 2i 

• s in 0 . cos a . 

T n i n l 


X 1 

0 

n 

0 

s in cpj. s in aj . 

s in cp 2 . s in a 2 . 

sin 9 ni sin a^. 


Y. 

0 

0 

n 

cos 

COS tjp 2i 

• cos cp . 

T n i 


Z ! 

sin <p u cos a l 

sin cp n sin a^. 

cos 

1 

0 

• 0 


— £ 

sin <p 2 . cos a 2 . 

sin <p 2 . sin a 2i 

cos 9 2i 

0 

1 

* • 0 


“^2i 

sin ( 2 ) . cos a . 

T m ni 

sin Cp sin a . 

T n i n i 

cos 

0 

0 

• • 1 


-^ni 
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Premultiplying (58) by the inverse of (ST SJ produces the desired solution for 
R. . 


(Sls i )- 1 sjs i * l =($s i r'& i x l 


(59a) 


R i = (ST s,)- 1 Sj X, 


(59b) 


The equation (58) may also be produced by straight forward evaluation of the 
partials in equation (57). 


^T = - 2 E (Pii Sin?ii COSa J* +X i ' Xi) = £>,. 

j =1 j=l 


cp. . cos a. . ) + 

T j i ji ; T 


n 

j=i 


T^: = ~ 2 E CP jl . in 9 Ji . in . J1+ Y J -y l )= 7 ^ (Pjj sin 9.. sin a..) + Y. 


3 Qi 

377 = - 2 4^ (p ii COS ^ii +Z > - z i)= <*„ c °s <P„) + > Zi -nz 4 =0 

j =1 j=l j=l 


9Qi 

- 2 (p.. sin 9^ cos a. . + X ; - x. ) sin ^ . cos a. . + 2 (p.. sin 9.. sin a.. 


(j - 1, 2, • • • ,n) Yj - y j ) sin 9. 4 sin a.. + 2 (p j; cos 9 ;i + Zj - z ; ) cos 9. . = 0 


ny. = 0 


y (60) 


By rearranging the terms of (60), i.e. 


(n equations 
J - 1*2, • • n 


n n 

nx i - J^ x i = J2 p ii sin ^i 

j=i j=i 


cos a. 


j i 
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n n 

ny . - P H sin <Pij sin a ji 


j=i j=i 


11 11 

nz > - 2>= 2Z Pi ‘ cosTii 

j=i j=i 


the relations represented in matrix notation in (58) can be readily obtained, 
once more justifying the matrix approach to least squares. 
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